Assessing the Impact of Thesaurus-Based Expansion Techniques in QA-Centric IR
Abstract
In this paper, we assess the impact of using thesaurus-based query expansion methods at the Information Retrieval (IR) stage of a Question Answering (QA) system. We focus on expanding queries for questions regarding actions and events, where verbs play a particularly important role. Two different thesauri are used: the OpenOffice thesaurus and an automatically generated verb thesaurus. The performance of the thesaurus-based methods is compared against that obtained by (i) performing no expansion and (ii) applying a simple query generalization method. Results show that thesaurus-based approaches help improve retrieval recall while keeping precision satisfactory. However, we confirm that the positive impact on final QA performance is mostly due to the increase in recall, which can also be obtained by alternative and simpler methods. Nevertheless, thesaurus-based expansion helps control the number of text passages retrieved, thus selectively reducing the computational load in the answer extraction stage.
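As a rough illustration of the kind of thesaurus-based expansion the abstract describes, the sketch below widens each query term with synonyms from a small lookup table and then checks whether a passage satisfies the expanded query. It is not the authors' implementation: the toy thesaurus, the function names `expand_query` and `matches`, and the `max_synonyms` parameter are all assumptions made for this example, and whitespace tokenization stands in for whatever linguistic processing a real system would use.

```python
from typing import Dict, List, Set

# Hypothetical toy verb thesaurus. The OpenOffice thesaurus and the
# automatically generated verb thesaurus from the paper are not reproduced here.
TOY_VERB_THESAURUS: Dict[str, List[str]] = {
    "buy": ["purchase", "acquire"],
    "kill": ["murder", "assassinate"],
}


def expand_query(
    terms: List[str],
    thesaurus: Dict[str, List[str]],
    max_synonyms: int = 2,
) -> List[Set[str]]:
    """Return one set of alternatives per query term.

    Terms found in the thesaurus are widened with up to max_synonyms
    synonyms; all other terms are kept unchanged.
    """
    expanded: List[Set[str]] = []
    for term in terms:
        alternatives = {term}
        alternatives.update(thesaurus.get(term, [])[:max_synonyms])
        expanded.append(alternatives)
    return expanded


def matches(passage: str, expanded: List[Set[str]]) -> bool:
    """A passage matches when every term slot has at least one alternative present."""
    tokens = set(passage.lower().split())
    return all(tokens & alternatives for alternatives in expanded)
```

With this sketch, the query terms `["google", "buy", "youtube"]` match the passage "google will acquire youtube" only after expansion, which is the recall gain the abstract refers to. Capping `max_synonyms` is one simple way to limit how many extra passages are retrieved.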
Similar resources
The GeoTALP-IR System at GeoCLEF-2005: Experiments Using a QA-based IR System, Linguistic Analysis, and a Geographical Thesaurus
This paper describes the GeoTALP-IR system, a Geographical Information Retrieval (GIR) system. The system is described and evaluated in the context of our participation in the CLEF 2005 GeoCLEF Monolingual English task. The GIR system is based on Lucene and uses a modified version of the Passage Retrieval module of the TALP Question Answering (QA) system presented at CLEF 2004 and TREC 2004 QA eval...
Syntactic Clues and Lexical Resources in Question-Answering
CL Research's question-answering system (DIMAP-QA) for TREC-9 significantly extends its semantic relation triple (logical form) technology in which documents are fully parsed and databases built around discourse entities. This extension further exploits parsing output, most notably appositives and relative clauses, which are quite useful for question-answering. Further, DIMAP-QA integrated mach...
Experiments with Query Expansion in the RAPOSA (FOX) Question Answering System
In this paper we present the results of applying a statistical query expansion method on the retrieval stage of a QA system for Portuguese (RAPOSA). Our approach involves expanding queries for event-related or action-related factoid questions using a verb thesaurus automatically generated using information extracted from large corpora. We show that our expansion approach improves QA recall when...
Impact of Controlled and Free Language Use in Retrieving Articles from the ProQuest and Science Direct Databases
Abstract Introduction: The growth and expansion of the Internet has changed the way information is accessed and many facilities have been created on the Web to facilitate and expedite information locating. Objective: To identify the impact of keyword documentation using the medical thesaurus on the retrieval of articles from ProQuest and Science Direct databases. Materials and Methods: The pr...
The Value of an in-Domain Lexicon in genomics QA
This paper demonstrates that a large-scale lexicon tailored for the biology domain is effective in improving question analysis for genomics Question Answering (QA). We use the TREC Genomics Track data to evaluate the performance of different question analysis methods. It is hard to process textual information in biology, especially in molecular biology, due to a huge number of technical terms w...